@juliocesar-io juliocesar-io commented Aug 28, 2024

Overview

This PR introduces a fully featured Local Notebook for performing inference, obtaining metrics, ranking the best model, and generating plots in a structured and reproducible manner, particularly for experimentation with large datasets.

The metrics are similar to those in the Colab notebook but optimized for a local installation with Docker. It also introduces parallel execution to leverage multiple GPUs.

The notebook operates by executing Docker commands using the Docker client and accessing OpenFold functions within a standalone environment. This approach ensures that the OpenFold codebase remains unaffected, serving as a client to help reproduce metrics and results from the Colab notebook locally.

Usage

Refer to the instructions in notebooks/OpenFoldLocal.ipynb

Set up the notebook

First, build OpenFold using Docker. Follow this guide.

Then, go to the notebooks folder

cd notebooks

Create an environment to run Jupyter with the requirements

mamba create -n openfold_notebook python==3.10

Activate the environment

mamba activate openfold_notebook

Install the requirements

pip install -r src/requirements.txt

Start your Jupyter server in the current folder

jupyter lab . --ip="0.0.0.0"

Access the notebook URL or connect remotely using VSCode.

Inference example

Initializing the client:

import docker
from src.inference import InferenceClientOpenFold

# You can also point this at a remote Docker server
docker_client = docker.from_env()

# Initialize the OpenFold Docker client, setting the databases path
databases_dir = "/path/to/databases"
openfold_client = InferenceClientOpenFold(databases_dir, docker_client)

Running Inference:

# For multiple sequences, separate sequences with a colon `:`
input_string = "DAGAQGAAIGSPGVLSGNVVQVPVHVPVNVCGNTVSVIGLLNPAFGNTCVNA:AGETGRTGVLVTSSATNDGDSGWGRFAG"

model_name = "multimer" # or "monomer"
weight_set = 'AlphaFold' # or 'OpenFold'

# Run inference
run_id = openfold_client.run_inference(weight_set, model_name, inference_input=input_string)
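If your chains live in a multi-record FASTA file, they need to be joined into the colon-separated string shown above. A small helper like the following can do that conversion; it is an illustrative sketch, not part of this PR's API (the function name `fasta_to_multimer_input` is hypothetical):

```python
def fasta_to_multimer_input(fasta_text: str) -> str:
    """Join the sequences of a multi-record FASTA into one colon-separated
    string, as expected by run_inference for multimer inputs.

    Hypothetical helper for illustration; not part of the PR's client."""
    sequences = []
    current = []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record header ends the previous sequence, if any.
            if current:
                sequences.append("".join(current))
                current = []
        else:
            current.append(line)
    if current:
        sequences.append("".join(current))
    return ":".join(sequences)

example = """>chainA
DAGAQGAAIGSPGVLSGNVVQVPVHVPVNVCGNTVSVIGLLNPAFGNTCVNA
>chainB
AGETGRTGVLVTSSATNDGDSGWGRFAG"""
input_string = fasta_to_multimer_input(example)
```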

Using a file:

input_file = "/path/to/test.fasta"

run_id = openfold_client.run_inference(weight_set, model_name, inference_input=input_file)
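The PR also mentions parallel execution across multiple GPUs. The fan-out pattern for that can be sketched as follows; this is a generic illustration with a stand-in run function, not the notebook's actual implementation (`run_parallel` and the `gpu_id` argument are assumptions):

```python
import queue
import threading

def run_parallel(inputs, gpu_ids, run_fn):
    """Dispatch each input to the next free GPU; return {input: result}.

    Illustrative sketch: one worker thread per GPU, each pulling work from
    a shared queue. run_fn(item, gpu_id) would wrap something like
    openfold_client.run_inference pinned to that device."""
    work = queue.Queue()
    for item in inputs:
        work.put(item)

    results = {}
    lock = threading.Lock()

    def worker(gpu_id):
        while True:
            try:
                item = work.get_nowait()
            except queue.Empty:
                return  # no work left for this GPU
            result = run_fn(item, gpu_id)
            with lock:
                results[item] = result

    threads = [threading.Thread(target=worker, args=(g,)) for g in gpu_ids]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

# Stand-in for a real inference call, for illustration only.
fake_run = lambda seq, gpu: f"run-{seq}-gpu{gpu}"
run_ids = run_parallel(["seqA", "seqB", "seqC"], [0, 1], fake_run)
```

Each worker drains the shared queue, so the inputs are load-balanced across GPUs rather than statically partitioned.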

Screenshots

(Two screenshots of the notebook attached, taken Aug 27, 2024.)
